Stock price analysis is a challenging area of research because many factors
directly affect stock prices, which makes it difficult to identify stock
trading signals reliably. The proposed approach builds a framework for
classifying stock trading signals by combining natural language processing
with technical analysis. The dataset comprises corporate news and stock
indicators from 01-01-2019 to 31-12-2021 for eight companies from the Thai
Industry Group Index and Sector Index. Two traditional machine learning
models, multilayer perceptron (MLP) and support vector machine (SVM), and
four deep learning models, bidirectional GRU (BiGRU), bidirectional LSTM
(BiLSTM), gated recurrent unit (GRU), and long short-term memory (LSTM),
were used for comparison. The trained model classifies daily trading signals
into three classes: buy, sell, and hold. The model's efficiency is then
evaluated by measuring accuracy, precision, recall, and F1-score. On average,
the BiGRU model obtained higher accuracy (0.93), precision (0.93), recall
(0.93), and F1-score (0.92) than the other models. Therefore, the BiGRU model
was appropriate for our experiment and was applied to determine daily trading
signals for analyzing investment returns.
1. INTRODUCTION


A stock is a type of financial asset divided into investment units of equal value. Stockholders become
owners of the business and hold property rights and a claim on income. They may also receive returns and
dividends, depending on the profit and the agreements of that business. Stockholders must rely on up-to-date
information and on market prices that respond to new information, such as daily stock prices and financial
news articles [1]. Listed companies are traded on a central marketplace known as the stock exchange, which
itself operates as a public company and develops the related systems for stock trading. The Thai stock market
is known as the Stock Exchange of Thailand (SET) [2]. A stock price index is a tool for tracking the stock
market's price levels and trends over time. Such time-series data are extensive and have specific behaviors
and properties. However, multiple factors influence the analysis of future daily stock prices, such as highly
volatile market conditions, a wide variety of investors, many funds, plenty of retail stocks, the volume of
purchases and sales, the holdings of foreign investors, and holiday volatility in the Thai stock market [3],
[4]. One issue that investors consider is identifying daily trading signals for corporate stocks, for
example, finding out how a company's stock price movement relates to opinions on that company [5]. Investors
therefore consider financial stock data, indicators, and other factors. However, these factors derive from
various sources, which makes it hard for investors to analyze stock trading signals appropriately. Thus, some
researchers use technical stock indicators to help investors analyze stock trading, for example, selecting
the technical indicators with the best features for stock prediction [6], predicting stock price trends with
neural networks on technical indicator data [7], and forecasting the Indian stock market with deep learning
using technical indicator classification [8]. The results of this research can give investors initial
guidance in analyzing stock forecasts.

Another essential factor that directly affects companies' stock prices, and that investors consider, is
corporate news, which presents and disseminates facts or events to people inside the company and to third
parties in order to build business confidence. Corporate text data related to stock indices mainly reflect
their impact on the company, so investors become aware of the trends or directions occurring in it.
Consequently, the volatility pattern of the stock price index could be more stable. Even so, analyzing
future investments remains a challenge. Corporate news written in Thai often raises problems because Thai
grammar is simple but the language is ambiguous [9], [10]: it contains homonyms, slang, punctuation marks,
word errors, word-order variation, abbreviations, and incomplete use of the language. Thai is written without
spaces between words [11], so correct word segmentation depends on identifying the boundaries between words
in the sentence [12]. The lack of a Thai corpus that covers every Thai word, and the use of an incomplete
corpus, can cause misinterpretation [13], [14]. Therefore, analyzing appropriate future investments that are
influenced by corporate news, stock indices, stock indicators, or other factors can be challenging.

This paper offers a combination of machine learning based natural language processing with technical
analysis for stock trading. Our proposed framework makes full use of corporate stock news and combines it
with technical-analysis stock indicator strategies as input to machine learning models. We selected eight
stocks and their corporate news from the Thai Industry Group Index and Sector Index [15], namely the services
group, agro and food industry group, resources group, industrials group, financials group, technology group,
consumer products group, and property and construction group, as the input dataset. We then compared two
traditional machine learning classification methods, multilayer perceptron (MLP) and support vector machine
(SVM), with four deep learning classification methods: long short-term memory (LSTM), gated recurrent unit
(GRU), bidirectional LSTM (BiLSTM), and bidirectional GRU (BiGRU). To measure classification performance we
used four popular metrics: accuracy, precision, recall, and F1-measure. Finally, we applied a model to define
daily trading signals for analyzing yearly and quarterly investment returns.

The principal contributions of this paper are as follows: i) Word tokenization selects words from the
word list in the Thai corpus, and appropriate words are added to the Thai corpus to optimize the tokenization
process. ii) Multiple indicator strategies are combined with daily stock price data to enhance the accuracy
of the analysis. iii) The dataset class is configured with a statistical method: a daily vote over the
trading signals from all corporate news and stock indicators, taking the majority signal. iv) Traditional and
deep learning models are trained on the combined datasets to classify stock trading signals. v) An initial
investment approach is analyzed by using the resulting model to evaluate yearly or quarterly investment
returns.

The remainder of the paper is organized as follows. Section 2 reviews the state-of-the-art literature.
Section 3 details the proposed method. Section 4 describes the results and discussion. Finally, section 5
concludes the paper.


2. LITERATURE REVIEW 
2.1. Stock indicator 

A stock indicator is a tool that applies mathematical principles to current and past stock data [16]
to create information that helps investors analyze both the trend and the volatility of the stock price in
the future. Such information is directly related to the time-series data of stock prices in the stock
market, the volume, or a trading reference index at the time of interest. Investors typically use the
following popular stock indicator tools:

Moving average (MA) is computed by averaging the historical stock price over a specified period [17].
For example, a five-day moving average is the average of the stock price over the last five days up to the
current date. The MA is one of the most widely used indicators in technical analysis and includes the simple
moving average (SMA) and the exponential moving average (EMA). The details are as follows:
- SMA is one of the critical indicators in technical analysis. It uses a sliding window to take the average
over a number of periods, so every value in the window is weighted equally. Although SMA is an easy-to-use
tool that characterizes the movement of price changes [17], its equal weighting makes it respond slowly to
price movements and drift away from the current price.

- EMA weights the values in the moving average differently. The idea is that the price on the latest date
is more important than the price on previous days [18]. Therefore, EMA emphasizes the current price more than
earlier prices. As a result, the EMA stays closer to the stock price than an SMA computed from the same data
over the same period.
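The difference between the two averages can be illustrated with a minimal sketch in plain Python. The function names, the made-up price list, and the choice to seed the EMA with the first price are assumptions for illustration, not the implementation used in this paper.

```python
def sma(prices, window):
    """Simple moving average: equal weight on the last `window` prices."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def ema(prices, period):
    """Exponential moving average with smoothing factor k = 2 / (period + 1)."""
    k = 2.0 / (period + 1)
    out = [prices[0]]  # assumption: seed the recursion with the first price
    for p in prices[1:]:
        out.append(k * p + (1 - k) * out[-1])
    return out


closes = [10.0, 11.0, 12.0, 13.0, 14.0, 20.0]  # hypothetical closing prices
print(sma(closes, 5)[-1], ema(closes, 5)[-1])
```

After the jump to 20 on the last day, the EMA value lies above the SMA value, which reflects the heavier weight EMA places on the most recent price.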

Relative strength index (RSI) represents the strength of price movements [19]. For trading decisions,
we calculate the RSI over 14 periods. RSI measures the speed of price movement with values ranging from 0 to
100. The most common threshold values are 30 and 70. If the RSI is below 30, the price is considered
oversold, and if the RSI is greater than 70, the price is considered overbought.
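As a rough illustration of the thresholds above, the following sketch computes RSI from the last 14 price changes. It assumes simple averages of gains and losses (Wilder's smoothing is a common alternative), and the function names are hypothetical.

```python
def rsi(prices, period=14):
    """RSI from the last `period` price changes, using simple averages."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    changes = [prices[i] - prices[i - 1]
               for i in range(len(prices) - period, len(prices))]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # assumption: no losses in the window maps to the maximum
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rsi_zone(value, low=30.0, high=70.0):
    """Label an RSI value with the common 30/70 thresholds."""
    if value < low:
        return "oversold"
    if value > high:
        return "overbought"
    return "neutral"


rising = [float(p) for p in range(1, 17)]  # hypothetical steadily rising prices
print(rsi(rising), rsi_zone(rsi(rising)))
```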

Moving average convergence divergence (MACD) is a stock price movement indicator showing the
relationship between two EMAs of a stock price time series [20]. The MACD indicator comprises the MACD
line and the signal line ($S_{MACD}$). The MACD line is computed by subtracting the 26-period EMA from the
12-period EMA of the stock price time series. The $S_{MACD}$ line is the 9-period EMA of the MACD line.
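The two MACD lines can be sketched in a few lines of plain Python. The helper names and the EMA seeding with the first value are assumptions; the periods 12, 26, and 9 are the defaults named in the text.

```python
def ema(values, period):
    """EMA with k = 2 / (period + 1), seeded with the first value (an assumption)."""
    k = 2.0 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(k * v + (1 - k) * out[-1])
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return (MACD line, signal line) for a list of closing prices."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)  # 9-period EMA of the MACD line
    return macd_line, signal_line


uptrend = [100.0 + i for i in range(60)]  # hypothetical steadily rising prices
macd_line, signal_line = macd(uptrend)
print(macd_line[-1], signal_line[-1])
```

In a steady uptrend the fast EMA follows the price more closely than the slow EMA, so the MACD line is positive.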

Stochastic (STO) is an indicator suitable for short-term investment analysis. STO can be used to
provide trading signals for overbought or oversold conditions. This technical indicator presents the stock's
closing price relative to the range between the high and low stock prices over a specific duration,
typically 14 days [21]. The indicator is drawn as two lines, %K and %D, whose values lie between 0 and 100.
The %K line is the stochastic line, and the %D line is an average of the %K line.
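A minimal sketch of the two stochastic lines follows, taking %D as a simple average of the last three %K values as described in the paragraph above. The function name, the value returned for a flat price range, and the sample data are assumptions.

```python
def stochastic(highs, lows, closes, k_period=14, d_period=3):
    """Return (%K values, %D values); %D is a simple average of %K."""
    percent_k = []
    for i in range(k_period - 1, len(closes)):
        highest = max(highs[i - k_period + 1:i + 1])
        lowest = min(lows[i - k_period + 1:i + 1])
        if highest == lowest:
            percent_k.append(50.0)  # assumption: flat range maps to the midpoint
        else:
            percent_k.append(100.0 * (closes[i] - lowest) / (highest - lowest))
    percent_d = [sum(percent_k[i - d_period + 1:i + 1]) / d_period
                 for i in range(d_period - 1, len(percent_k))]
    return percent_k, percent_d


# Hypothetical daily high, low, and close prices for 20 days.
highs = [float(i + 1) for i in range(20)]
lows = [float(i - 1) for i in range(20)]
closes = [float(i) for i in range(20)]
k_line, d_line = stochastic(highs, lows, closes)
print(k_line[-1], d_line[-1])
```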

Bollinger bands (BBAND) is a technical analysis tool that shows price fluctuations over time. John
Bollinger created this tool in the 1980s. Investors use this chart to make technical analysis decisions [22].
BBAND is commonly used to analyze short-term charts, such as daily or weekly charts. It visualizes the range
of highest and lowest volatility and comprises three lines: the upper, middle, and lower bands. Generally,
the upper and lower bands lie two standard deviations above and below a 20-period simple moving average.
The formulas for these stock indicators are summarized in Table 1.


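Table 1. Stock indicators and formulas

| Stock indicator | Formula |
|---|---|
| SMA | $SMA_W = \frac{1}{W}\sum_{i=N-W+1}^{N} p_i$ |
| EMA | $EMA_n = k\,p_n + EMA_{n-1}(1-k)$, with $k = \frac{2}{t+1}$ |
| RSI | $RSI = 100 - \frac{100}{1+RS}$ |
| MACD | $MACD = EMA_E(p) - EMA_F(p)$; $S_{MACD} = EMA_M(MACD)$ |
| STO | $\%K = \frac{C - L_r}{H_r - L_r} \times 100$; $\%D = \frac{H_s}{L_s} \times 100$ |
| BBAND | Middle Band $= MA(D)$; Upper Band $= MA(D) + K \times S.D.$; Lower Band $= MA(D) - K \times S.D.$ |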


where p is the series of data points (closing prices); N is the number of data points; W is the window size;
t is the period of the smoothing factor; n is the time period; RS is the average gain divided by the average
loss; E, F, and M are the EMA periods, equal to 12, 26, and 9, respectively, which are the default settings
that most investors use; C is the most recent closing price; L and H are the lowest and highest prices
during the r and s time periods; r and s are equal to 14 and 3, respectively; S.D. is the standard deviation;
D is the number of days and K is the number of standard deviations, equal to 20 and 2, respectively.
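The Bollinger band rows of Table 1 can be sketched with the standard library as follows. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
from statistics import mean, pstdev


def bollinger_bands(prices, window=20, num_sd=2.0):
    """Return (lower, middle, upper) bands for the last `window` prices."""
    if len(prices) < window:
        raise ValueError("need at least `window` prices")
    recent = prices[-window:]
    middle = mean(recent)   # MA(D)
    sd = pstdev(recent)     # assumption: population standard deviation
    return middle - num_sd * sd, middle, middle + num_sd * sd


swing = [9.0, 11.0] * 10  # hypothetical prices oscillating around 10
print(bollinger_bands(swing))
```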


2.2. Natural language processing 

Natural language processing (NLP) enables computers to comprehend human languages [23].
Humans use natural languages for highly complex communication. Computer programs are therefore designed
to handle natural human language by converting it into numbers or codes that computers can process. NLP
emerged to bridge the gap in human-computer communication. NLP has become a foundation technology for
artificial intelligence programs that can help solve social problems, assist the visually impaired with
reading, and address linguistic complications through speech translation systems, linguistic analysis, text
interpretation, and sentiment analysis on social media platforms [24]. Thai is a natural language in which
words are written continuously without spaces between them, which causes errors in the word segmentation
process [9]. Thus, we should focus more on word segmentation in Thai sentences.


2.3. Traditional machine learning classifiers 

Machine learning (ML) is the creation of algorithms that enable computers to learn from data and make
predictions. The algorithms are adapted to the most current available data to solve related problems [25], so
we can use them to make more intelligent predictions about the future. Machine learning uses mathematical
algorithms to build models from training data for predicting unseen data. We can then apply the predicted
values to make decisions. The learning process is a technique for adjusting a machine learning model's
parameters. In most cases, it runs for several epochs to lower the prediction error. Machine learning is
useful for tasks such as customer segmentation and customer churn prediction. MLP and SVM are two popular
traditional machine learning classifiers.

SVM is a machine learning algorithm for classification and regression tasks [26]. The SVM algorithm
can classify multidimensional data efficiently. Nonlinear SVM is used for datasets that cannot be separated
linearly. It maps the data efficiently into a higher-dimensional space, a technique referred to as the
kernel trick. Equations (1) and (2) show the nonlinear SVM classifier.


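$f(x) = \mathrm{sign}\left(\sum_{a=1}^{N} \alpha_a y_a\, s(x, x_a) + b\right)$ (1)
$s(x, x_a) = x_a^{T} x$ (2)


where x is the input data and f(x) is the predicted class; $\alpha_a$ are positive real constants associated
with the support vectors $x_a$; N is the number of training samples; sign is the signum function; $y_a$ are
the labels of the dataset ($y_a \in \{-1, 1\}$); s and b represent a kernel function and a bias term,
respectively.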
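Equations (1) and (2) can be read as the following decision rule, sketched in plain Python. The support vectors, multipliers, and bias below are made-up values, not a trained model, and mapping a score of exactly zero to the class 1 is an assumption.

```python
def dot_kernel(x, x_a):
    """Kernel of equation (2): the inner product of x_a and x."""
    return sum(u * v for u, v in zip(x, x_a))


def svm_decide(x, support_vectors, alphas, labels, bias, kernel=dot_kernel):
    """Equation (1): sign of the weighted kernel sum plus the bias."""
    score = sum(a * y * kernel(x, x_a)
                for a, y, x_a in zip(alphas, labels, support_vectors)) + bias
    return 1 if score >= 0 else -1  # assumption: a zero score maps to class 1


# Hypothetical support vectors with their multipliers and labels.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
alphas = [0.5, 0.5]
labels = [1, -1]
print(svm_decide((2.0, 0.5), support_vectors, alphas, labels, bias=0.0))
```

Swapping `dot_kernel` for another kernel function changes the decision boundary without changing the rest of the rule.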

MLP is an artificial neural network that simulates the human brain, in which many neurons are
interconnected, so that computers can learn and recall [27]. MLP arranges perceptrons into layers, and each
layer passes signals on to the next layer. Thus, the input data flows in one direction through the fully
connected nodes. MLP consists of the input layer, hidden layers, and the output layer. The input layer takes
values and transmits them to other nodes. The hidden layer is the middle layer, which affects the model's
learning performance and may include many nodes. The output layer is the last layer and generates the
network's output. The number of nodes in the output layer depends on the dimension of the target data.
Equations (3) through (5) show the computations occurring at every node in the hidden and output layers.


$a^{(1)} = f^{(1)}(n^{(1)}) = f^{(1)}(W^{(1)} x + b^{(1)})$ (3)
$a^{(2)} = f^{(2)}(n^{(2)}) = f^{(2)}(W^{(2)} a^{(1)} + b^{(2)})$ (4)
$y = f^{(3)}(n^{(3)}) = f^{(3)}(W^{(3)} a^{(2)} + b^{(3)})$ (5)


where x is the input variable fed to the input layer; $W^{(j)}$ is the weight matrix between neurons at the
j-th layer; $n^{(j)}$ is the network output at the j-th layer; $b^{(j)}$ is the bias vector at the j-th
layer; with j = 1, 2, 3. A typical choice for the nonlinear activation function in the j-th hidden layer is
the sigmoid function $f^{(j)}(n) = 1/(1+\exp(-n))$, and the softmax activation function is used in the last
layer for classification to produce the output vector y.
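The forward pass of equations (3) to (5) can be sketched in plain Python as follows. The layer sizes and weight values are arbitrary placeholders, and the helper names are hypothetical.

```python
import math


def dense(weights, bias, inputs):
    """Compute W x + b for one layer (weights is a list of rows)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_forward(x, layers):
    """Sigmoid on every hidden layer, softmax on the last layer."""
    activation = x
    for weights, bias in layers[:-1]:
        activation = sigmoid(dense(weights, bias, activation))
    weights, bias = layers[-1]
    return softmax(dense(weights, bias, activation))


# Hypothetical 2-input network: two hidden layers of 2 nodes, 3 output classes.
layers = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),
    ([[0.3, 0.3], [-0.6, 0.2]], [0.0, 0.0]),
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.0, 0.0]),
]
probabilities = mlp_forward([1.0, 2.0], layers)
print(probabilities)
```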


2.4. Deep learning classifiers 

Deep learning is a branch of machine learning based on artificial neural networks. It offers many
algorithms for modeling high-level abstractions in data and resembles the biological communication of the
brain's neuron system. A deep learning model is a network with many artificial neurons [25] arranged at
different levels of association, forming a multilayer computing network. Deep learning utilizes parallel
processing to accelerate learning. Deep learning models are easy to use, and researchers have applied them to
many applications, such as COVID-19 outbreak forecasts that estimate the morbidity and severity of COVID-19
in Russia, Brazil, India, and the United States by comparing deep learning models with other models [28], or
research on improving agricultural efficiency through plant leaf disease recognition [29]. A deep learning
model comprises many neurons and hidden layers, with higher tiers representing more complex patterns.
However, it also uses more processing resources. Popular deep learning models include convolutional neural
networks (CNN), recurrent neural networks (RNN), and transformers [30].

A combination of machine learning based natural language processing with technical ... (Phayung Meesad)

426 ISSN: 2502-4752

An RNN is a deep learning model that feeds the computed output of its neurons back in as input. We can
apply RNNs to sequential data and natural language processing. RNNs are useful when future information
depends on the continuity of previous events [30]. The RNN architecture has two critical components: input
nodes and hidden states. The compute node receives input from the input data, and the hidden state records
the outcomes of the computation performed by the previous neurons. The hidden states are additional inputs to
the network. RNNs often encounter vanishing and exploding gradient problems on long data sequences [31]. For
example, in word prediction an RNN uses the preceding words to predict the next word in the text. However,
the RNN receives the data sequence step by step, computes, and forwards the result to the next neuron. As a
result, the neurons cannot memorize all the sequence data. Furthermore, the overall gradient becomes almost
imperceptible, so RNNs work well only on short sequences. Equations (6) to (8) show the RNN computations.


$h_t = f(h_{t-1}, x_t)$ (6)
$h_t = f_h(W_h h_{t-1}, W_x x_t)$ (7)
$y_t = f_y(W_y, h_t)$ (8)


where $x_t$ represents the network's input at time step t; $h_t$ acts as the network memory and represents
the hidden state at time t, computed from the current input and the hidden state of the preceding time step;
$W_h$, $W_x$, and $W_y$ are weight matrices; f represents a non-linear transformation such as tanh or ReLU;
and $y_t$ represents the output at step t.
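One common concrete form of equations (6) to (8) is sketched below. It assumes that the two weighted terms are added, that f is tanh, that the output is linear, and that there are no bias terms; the weights are made-up values.

```python
import math


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_forward(inputs, W_h, W_x, W_y, h0):
    """Run a simple RNN over a sequence; returns (outputs, last hidden state)."""
    h = h0
    outputs = []
    for x in inputs:
        # Assumption: the recurrent and input terms are summed before tanh.
        pre = [a + b for a, b in zip(matvec(W_h, h), matvec(W_x, x))]
        h = [math.tanh(p) for p in pre]
        outputs.append(matvec(W_y, h))  # assumption: linear output layer
    return outputs, h


# Hypothetical weights: 1 input feature, 2 hidden units, 1 output.
W_h = [[0.0, 0.0], [0.0, 0.0]]
W_x = [[1.0], [-1.0]]
W_y = [[1.0, 1.0]]
outputs, last_h = rnn_forward([[0.5], [0.2]], W_h, W_x, W_y, h0=[0.0, 0.0])
print(outputs, last_h)
```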

LSTM is one of the most popular RNN architectures for sequential data and is well suited to
classification and time series prediction [32]. It works with a cell state, or memory cell, that stores the
state of each neuron so that the state carried over from the previous cell can be recalled [33]. Gates
between the input and the output control the data flow.

LSTM comprises three primary gates: the forget gate, the input gate, and the output gate [31]. The
forget gate controls the flow of the new input data combined with the previous hidden state. A sigmoid
function decides whether to delete data by assigning a value between 0 and 1. If the sigmoid value is 0, the
gate deletes the original cell state so that the cell state can receive new data. The input gate receives
the input data and the previous hidden state and passes them into a sigmoid function to regulate the data.
This sigmoid function decides whether to update the cell state by assigning a value between 0 and 1: a value
of 0 leaves the original cell state unchanged, and a value of 1 updates it. Moreover, the input modulation
gate adjusts the data written into the cell state by adding non-linearity to the data and making it
zero-mean. The input modulation gate uses the tanh activation function, whose range is [-1, 1]. The output
gate controls the new hidden state by receiving the old hidden state and the current input and passing them
into a sigmoid function, which decides whether to store the data or forward it to the subsequent neurons. If
the value is 1, the gate exports the data. The data then flows to the next neuron after being activated by
the tanh function, which scales the output to between -1 and 1. Equations (9) to (14) compute all LSTM
signals.


$f_t = \sigma(W_f[h_{t-1}, x_t] + b_f)$ (9)
$i_t = \sigma(W_i[h_{t-1}, x_t] + b_i)$ (10)
$c_t = f_t \otimes c_{t-1} + i_t \otimes \tilde{c}_t$ (11)
$\tilde{c}_t = \tanh(W_c[h_{t-1}, x_t] + b_c)$ (12)
$o_t = \sigma(W_o[h_{t-1}, x_t] + b_o)$ (13)
$h_t = o_t \otimes \tanh(c_t)$ (14)


where $f_t$, $i_t$, and $o_t$ are the forget, input, and output gates, respectively; $\sigma$ is a sigmoid
activation function; tanh is a hyperbolic tangent activation function; $W_c$, $W_f$, $W_i$, and $W_o$ are the
weight matrices of the memory cell and of the three gates, respectively; $b_c$, $b_f$, $b_i$, and $b_o$ are
the bias vectors of the memory cell and of the three gates, respectively; $h_{t-1}$ represents the hidden
units of the previous step, which are combined with the weights of the three gates; $\tilde{c}_t$ represents
a candidate memory cell; $c_t$ is the resulting active memory cell; the symbol $\otimes$ is a pointwise
multiplication operation. Here, t represents the current time step and t-1 is the previous time step.
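A single LSTM time step following equations (9) to (14) might look like the sketch below. The dictionary layout of the parameters and the helper `zero_params` are hypothetical conveniences, and the all-zero weights are placeholders rather than trained values.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, bias, vector):
    """Compute W v + b, where weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; every weight matrix acts on the concatenation [h_prev, x_t]."""
    z = h_prev + x_t  # list concatenation
    f = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], z)]          # (9)
    i = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], z)]          # (10)
    c_tilde = [math.tanh(v) for v in affine(params["W_c"], params["b_c"], z)]  # (12)
    c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]   # (11)
    o = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], z)]          # (13)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                           # (14)
    return h, c


def zero_params(hidden, inputs):
    """Hypothetical helper: all-zero weights and biases of the right shapes."""
    params = {}
    for gate in ("f", "i", "c", "o"):
        params["W_" + gate] = [[0.0] * (hidden + inputs) for _ in range(hidden)]
        params["b_" + gate] = [0.0] * hidden
    return params


h, c = lstm_step([0.3], [0.1, -0.2], [1.0, -2.0], zero_params(2, 1))
print(h, c)
```

With zero weights every gate outputs 0.5 and the candidate cell is 0, so the new cell state is half of the previous one.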

GRU has fewer parameters than LSTM. Each cell in the GRU has a reset gate and an update gate that
control the data flow [30]. The reset gate regulates the incoming input signal and the previous hidden state.
The update gate controls the storage of previous state data and combines the roles of the input gate and
forget gate of the LSTM architecture [34]. The GRU formulas are shown in (15) to (18).


$r_t = \sigma(W_r[h_{t-1}, x_t] + b_r)$ (15)
$\tilde{h}_t = \tanh(W_h[r_t \otimes h_{t-1}, x_t] + b_h)$ (16)
$z_t = \sigma(W_z[h_{t-1}, x_t] + b_z)$ (17)
$h_t = (1 - z_t) \otimes h_{t-1} + z_t \otimes \tilde{h}_t$ (18)


where $z_t$ is the update gate, which regulates the data flowing to the next state; $r_t$ represents the
reset gate, which is computed similarly to the update gate; when the reset gate is set to zero, the unit
reads the input sequence and forgets the previously calculated state. Moreover, $\tilde{h}_t$ is the
candidate activation, which functions as in a standard recurrent unit; $h_t$ at time t of the GRU is a linear
interpolation between the previous activation state $h_{t-1}$ and the candidate $\tilde{h}_t$; $b_r$, $b_h$,
and $b_z$ are bias vectors; the symbol $\otimes$ is a pointwise multiplication operation.
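A single GRU time step following equations (15) to (18) can be sketched in the same style. The parameter dictionary and the helper `zero_params` are hypothetical, and the zero weights are placeholders.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(weights, bias, vector):
    """Compute W v + b, where weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def gru_step(x_t, h_prev, params):
    """One GRU step with reset gate r, update gate z, and candidate h_tilde."""
    z_in = h_prev + x_t  # concatenation [h_prev, x_t]
    r = [sigmoid(v) for v in affine(params["W_r"], params["b_r"], z_in)]      # (15)
    reset_h = [rt * hp for rt, hp in zip(r, h_prev)]
    h_tilde = [math.tanh(v)
               for v in affine(params["W_h"], params["b_h"], reset_h + x_t)]  # (16)
    z = [sigmoid(v) for v in affine(params["W_z"], params["b_z"], z_in)]      # (17)
    return [(1.0 - zt) * hp + zt * ht
            for zt, hp, ht in zip(z, h_prev, h_tilde)]                        # (18)


def zero_params(hidden, inputs):
    """Hypothetical helper: all-zero weights and biases of the right shapes."""
    params = {}
    for gate in ("r", "h", "z"):
        params["W_" + gate] = [[0.0] * (hidden + inputs) for _ in range(hidden)]
        params["b_" + gate] = [0.0] * hidden
    return params


h_new = gru_step([0.3], [0.8, -0.4], zero_params(2, 1))
print(h_new)
```

With zero weights the update gate is 0.5 and the candidate is 0, so the new state is the midpoint between the previous state and zero, which shows the interpolation in (18).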

A GRU implementation is as effective as the LSTM algorithm; nevertheless, GRU has a simpler
architecture and fewer parameters for modeling sequential data. GRU forgets features of unimportant data
while preserving important data over long sequences, so the model can run more quickly. Therefore, compared
to LSTM, GRU is more robust when data are limited [35].

Bidirectional recurrent neural networks (BiRNN) originate from RNNs and are suited to sequence data such
as voice or text, because RNNs have an internal memory that can remember past or future contexts to aid
analysis and prediction. A BiRNN connects hidden layers running in opposite directions to the same output.
The output layer therefore receives information from past (backward) and future (forward) states
simultaneously [32], which solves NLP problems more efficiently [36]. Well-known BiRNNs are BiLSTM and
BiGRU. BiLSTM consists of two LSTMs used in the sequence processing model. The first accepts the input going
forward, whereas the second accepts the input going backward. In essence, BiLSTM effectively expands the
quantity of data available to the network and improves the context provided to the algorithm. BiGRU is a
bidirectional neural network built from GRU cells with update and reset gates. It comprises two GRU sequence
processing models, one reading the sequence from start to end and the other from end to start. Many
researchers have used these models in various studies, such as evaluating the effectiveness of BiRNN models
in time series forecasting. The results demonstrate that the BiRNN model's predictions outperform the others
[32]. Furthermore, BiRNN model training has positive and significant implications for improving forecast
accuracy. Therefore, the BiRNN model is a good candidate for time series forecasting.


3. METHOD 

This research proposes a hybrid model for stock trading based on machine learning, natural language
processing, and technical analysis. The proposed model helps investors decide when to trade, with a high
chance of trading the stock at the right time. Stock price classification depends not only on financial
stock time series data, including open, high, low, close, adj close, and volume, but also on the corporate
news that affects the classification [37]. Therefore, we combined corporate news with several stock
indicators as input to machine learning classifiers to increase the accuracy of stock trading signal
classification. The conceptual framework is shown in Figure 1.


3.1. Data gathering 

The dataset comprises news and financial stock data from selected Thai companies. The data were
collected between 01-01-2019 and 31-12-2021 for eight companies across the Thai Industry Group Index and
Sector Index groups: services group, agro and food industry group, resources group, industrials group,
financials group, technology group, consumer products group, and property and construction group. The
corporate news for each company was collected from www.kaohoon.com by web crawling, using the BeautifulSoup
Python package. In addition, we retrieved financial stock data using the Yahoo Finance API. The stock price
signals included open, high, low, close, adj close, and volume.
Figure 1. Conceptual framework


3.2. Data preprocessing 

In the proposed framework, we divided data preprocessing into two parts: corporate news text data
and stock price time series. The data preprocessing is described as follows.

In the first part, we kept only company news written in Thai from the same period as the stock prices
and discarded other irrelevant news messages. We then cleansed the text to remove unwanted content and keep
only the necessary and important words from the news. The unwanted text included HTML tags such as
<a>...</a> and <p>...</p>, special characters such as +-*/=.,$"!%&€_#~, and items such as spaces or URLs.
For example, this is a Thai sentence followed by its phonemic transcription and English translation:
‘ตลาดหุ้นไทยล่าสุดปิดการซื้อขาย ...’ [Tlad hun thiy lasud pid kar sux khay ...] (The latest Thai stock
market closes for trading ...). Subsequently, the Thai word tokenization process split the Thai texts into
words by selecting words from the word list in the Thai corpus. We used the PyThaiNLP package [38] for Thai
word tokenization in our experiments. The default PyThaiNLP tokenization algorithm is newmm, a maximum
matching algorithm that tokenizes quickly and accurately against the word list.
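The cleansing step described above can be sketched with regular expressions. The function name and the exact patterns are assumptions; in particular, removing all whitespace suits Thai text, which is written without spaces between words, but would merge words in space-delimited languages.

```python
import re


def clean_news_text(raw):
    """Strip HTML tags, URLs, the listed special characters, and whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)                   # HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # URLs, before '.' and '/'
    text = re.sub(r'[+\-*/=.,$"!%&€_#~]', " ", text)      # special characters
    return re.sub(r"\s+", "", text)                       # spaces and line breaks


sample = '<p>ตลาดหุ้นไทย ล่าสุด!</p> www.example.com/news <a href="#">ปิดการซื้อขาย</a>'
print(clean_news_text(sample))
```

URLs are removed before the special characters because the character list contains `.` and `/`, which would otherwise break a URL into fragments that survive cleaning.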

Because Thai word tokenization relies on the word list of the Thai corpus, sentences that contain words
missing from the word list are segmented incorrectly. The following are some wrongly segmented words. A
specific name should not be separated, yet ‘ธนาคารทหารไทย’ [Thnakhar thhar thiy] (Thai Military Bank) was
tokenized to ‘ธนาคาร | ทหาร | ไทย’ [Thnakhar | thhar | thiy]. A corporate name is a specific word, so its
parts should stay together. Likewise, an abbreviation of a corporate name should not be separated; for
example, ‘กฟผ.’ [kor for phor] (Electricity Generating Authority of Thailand) was tokenized to
‘ก | ฟ | ผ.’ [kor | for | phor]. The abbreviation is a corporate name, and the separated pieces have no
meaning. To solve these tokenization errors, we added new words and other appropriate words to the Thai
corpus to optimize the tokenization process, such as abbreviations, corporate names, common misspellings,
and specific words, which resulted in 63,678 words in the Thai corpus.
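The effect of adding corporate names to the word list can be illustrated with a greedy longest-match tokenizer. This is a simplified stand-in written for illustration, not the newmm algorithm used by PyThaiNLP, and the tiny word lists are made up.

```python
def longest_match_tokenize(text, vocabulary):
    """Greedy dictionary tokenizer: always take the longest known word."""
    max_len = max((len(word) for word in vocabulary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocabulary:
                tokens.append(piece)
                i += size
                break
        else:
            tokens.append(text[i])  # unknown character becomes its own token
            i += 1
    return tokens


base_words = {"ธนาคาร", "ทหาร", "ไทย"}
name = "ธนาคารทหารไทย"
print(longest_match_tokenize(name, base_words))            # split into three words
print(longest_match_tokenize(name, base_words | {name}))   # kept as one token
```

Once the full corporate name is in the vocabulary, the longest match wins and the name is no longer split.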

After that, we used a word embedding approach to generate feature vectors from the words. This
technique converts each word into a unique index number based on its frequency, so that the most frequent
words receive the lowest index numbers. For example, a list of tokenized words and its indexed numbers is
shown below:

Word tokenizing:
[‘ปัญหา’, ‘จาก’, ‘สถานการณ์’, ‘โควิด’, ‘ทำให้’, ‘สุขภาพ’, ‘ของ’, ‘ประชาชน’, ...]
[‘Payha’, ‘cak’, ‘sthankarn’, ‘khowid’, ‘thahi’, ‘sukhphaph’, ‘khxng’, ‘prachachn’, ...]

Meaning:
[‘problem’, ‘from’, ‘situation’, ‘COVID’, ‘affect’, ‘health’, ‘of’, ‘people’, ...]

Word indexing:
[120, 233, 789, 1047, 845, 975, 86, 542, ...]
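Frequency-based indexing of this kind can be sketched as follows. The function names, the alphabetical tie-breaking rule, the choice to reserve index 0 for padding, and the decision to drop unknown words are all assumptions for illustration.

```python
from collections import Counter


def build_word_index(tokenized_docs):
    """Map each word to a rank: 1 for the most frequent word, 2 for the next, ..."""
    counts = Counter(token for doc in tokenized_docs for token in doc)
    # Sort by descending frequency; ties are broken alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}


def encode(tokens, word_index):
    """Replace words by their index numbers, dropping unknown words."""
    return [word_index[token] for token in tokens if token in word_index]


docs = [["stock", "price", "stock"], ["price", "stock", "news"]]  # toy corpus
word_index = build_word_index(docs)
print(word_index, encode(["news", "stock", "unseen"], word_index))
```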


Additionally, we prepared the datasets by fixing a maximum sequence length so that all sequences have
the same length. As in the example below, we used zero pre-padding for every dataset:

[120, 233, 2699, 894, 1045, 99, 43, 30, 640, 429, 222, 543, 234, 69, 999, 88, 2493, 15, 16, 2519, ...]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 135, 429, 2699, 53, 2521, 3, 435, 212, 1234, 7132, 16, 46, ...] 
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 13, 5, 16, 1045, 8, 2519, 293, 3, 85, 343, 1205, 215, 163, 72, 44, 99, ...] 
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 19, 2576, 58, 2568, 458, 16, 32, 46, 2521, 874, 318, 45, 88, 72, ...] 
[0, 0, 0, 0, 0, 0, 0, 22, 18, 15, 28, 378, 99, 168, 34, 2565, 124, 86, 96, 2654, 47, 14, 247, 99, 247, ...] 
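Zero pre-padding can be sketched in a few lines. The function name is hypothetical, and keeping the last `maxlen` items of an overlong sequence is an assumed truncation rule that the text does not specify.

```python
def pre_pad(sequences, maxlen, value=0):
    """Left-pad every sequence with `value` up to `maxlen` items (maxlen > 0)."""
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]  # assumption: truncate from the front
        padded.append([value] * (maxlen - len(kept)) + kept)
    return padded


print(pre_pad([[9, 13, 5], [120, 233, 2699, 894, 1045, 99]], maxlen=5))
```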


For the second part of the input, we used financial stock data to identify trading signals from the
historical stock price trend on each day, following technical-analysis stock indicator strategies. This
technique increases the chance that a trade ends in profit rather than loss. However, a single stock
indicator cannot identify proper stock trading signals. Therefore, we combined multiple stock indicator
strategies with daily stock price data to increase the accuracy of analysis and decision-making. We selected
the following indicators for this experiment to define daily stock trading signals: MA, RSI, STO, MACD, and
BBAND. A summary of the indicator strategy is shown in Table 2. Moreover, we transformed the trading signal
of each day into binary data, which is appropriate input data for the classification model.


Table 2. A summary of the indicator strategy 
Stock indicator   Strategy 
MA                Sell signal: the short-period EMA line crosses below the long-period EMA line. 
                  Buy signal: the short-period EMA line crosses above the long-period EMA line. 
RSI               Sell signal: the RSI line rises above 70. 
                  Buy signal: the RSI line falls below 30. 
MACD              Sell signal: the MACD line crosses below the signal line. 
                  Buy signal: the MACD line crosses above the signal line. 
STO               Sell signal: the %K line rises into the overbought zone, or the %K line crosses below the %D line. 
                  Buy signal: the %K line falls into the oversold zone, or the %K line crosses above the %D line. 
BBAND             Sell signal: the close price rises to touch the upper BBAND. 
                  Buy signal: the close price falls to touch the lower BBAND. 
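
The rules in Table 2 can be expressed as small functions. The sketch below covers the crossover rule (MA, MACD, and the %K/%D part of STO), the RSI threshold rule, and the BBAND touch rule. It assumes the indicator values have already been computed; all function names and sample numbers are hypothetical and are not the code used in the experiment.

```python
def crossover_signal(prev_fast, prev_slow, fast, slow):
    """Crossover rule: short vs. long EMA, MACD vs. signal line, or %K vs. %D."""
    if prev_fast <= prev_slow and fast > slow:
        return "buy"    # the fast line passes above the slow line
    if prev_fast >= prev_slow and fast < slow:
        return "sell"   # the fast line passes below the slow line
    return None         # no crossing on this day


def rsi_signal(rsi, overbought=70, oversold=30):
    """RSI rule from Table 2: sell above 70, buy below 30."""
    if rsi > overbought:
        return "sell"
    if rsi < oversold:
        return "buy"
    return None


def bband_signal(close, upper, lower):
    """BBAND rule: sell when the close touches the upper band, buy at the lower band."""
    if close >= upper:
        return "sell"
    if close <= lower:
        return "buy"
    return None


# Hypothetical values for one trading day.
print(crossover_signal(9.8, 10.0, 10.2, 10.1), rsi_signal(75), bband_signal(96, 104, 96))
```

Each function returns `None` when its rule does not fire, so the per-day signals can be collected and combined afterwards.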


We built the model from datasets in which relevant corporate news text and stock indicators co-occur. 
All stock trading signals in the datasets were defined by majority voting over the buy and sell signals from 
all corporate news and stock indicators each day. We then used the resulting trading signals to assign the 
dataset classes: buy, sell, and hold. Class ‘buy’ refers to data for which the buy signal prevails on that day, 
which will most likely result in an upward trend in stock prices. Class ‘sell’ refers to data for which the sell 
signal prevails on that day, which will most likely result in a downward trend in stock prices. Class ‘hold’ 
refers to data for which the buy and sell signals are equally identified on that day. 
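
The voting step can be sketched as follows. This is a minimal illustration that assumes each news item and each indicator contributes one vote per day ("buy", "sell", or nothing); the function name `daily_class` is hypothetical, and treating a day with no votes as "hold" is an assumption that follows from the tie rule rather than something the text states.

```python
from collections import Counter


def daily_class(signals):
    """Majority vote over one day's signals from news and indicators."""
    counts = Counter(s for s in signals if s in ("buy", "sell"))
    if counts["buy"] > counts["sell"]:
        return "buy"
    if counts["sell"] > counts["buy"]:
        return "sell"
    return "hold"  # equal numbers of buy and sell votes (including none at all)


# Hypothetical day: three buy votes, one sell vote, one source with no signal.
print(daily_class(["buy", "sell", "buy", None, "buy"]))
```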

Furthermore, we split the dataset into two sets: the training dataset from 01-01-2019 to 31-12-2020 
and the testing dataset from 01-01-2021 to 31-12-2021. Then, we applied one-hot encoding to transform 
the class labels into binary code format. We converted class ‘buy’ to [1, 0, 0], class ‘sell’ to 
[0, 0, 1], and class ‘hold’ to [0, 1, 0]. 
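
A minimal sketch of the date-based split and the one-hot mapping described above. The date ranges and the three code vectors come from the text; the row layout `(date, label)` and the function name `split_by_date` are assumptions made for illustration.

```python
from datetime import date

# One-hot codes as given in the text.
ONE_HOT = {"buy": [1, 0, 0], "hold": [0, 1, 0], "sell": [0, 0, 1]}

TRAIN_START, TRAIN_END = date(2019, 1, 1), date(2020, 12, 31)
TEST_START, TEST_END = date(2021, 1, 1), date(2021, 12, 31)


def split_by_date(rows):
    """rows: iterable of (date, label). Returns (train, test) with one-hot labels."""
    train, test = [], []
    for day, label in rows:
        item = (day, ONE_HOT[label])
        if TRAIN_START <= day <= TRAIN_END:
            train.append(item)
        elif TEST_START <= day <= TEST_END:
            test.append(item)
    return train, test


# Hypothetical rows, one per trading day.
rows = [(date(2019, 3, 4), "buy"), (date(2020, 12, 30), "hold"), (date(2021, 6, 1), "sell")]
train, test = split_by_date(rows)
print(len(train), len(test))
```

Splitting by date rather than at random keeps every test day later than every training day.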


3.3. Classification models 

We employed several machine learning models for stock trading signal classification, including 
traditional and deep learning models. For the traditional models, we used SVM and MLP. For the deep 
learning models, we used LSTM, GRU, BiLSTM, and BiGRU. We used the scikit-learn package to construct 
the traditional models and the Keras functional API with a TensorFlow backend for the deep learning 
models. Table 3 lists the model configurations. 


Table 3. Machine learning model configurations 
Model                      Parameter (values) 
LSTM, GRU, BiLSTM, BiGRU   dimension unit (128, 64, 32, 16), maxlen (maximum number of word tokens), dropout (0.4), 
                           activation function (ReLU), batch size (10), epochs (10), optimizer (Adam) 
SVM                        kernel (Linear), C (0.1), tolerance (0.0001), max_iter (1,000) 
MLP                        activation function (Sigmoid), hidden_layer_sizes (100), tolerance (0.0001), max_iter (200), 
                           optimizer (Adam) 


In this experiment, we constructed the traditional classifiers using default parameters for the model 
configurations. For deep learning, we built the LSTM-based classifier with stacked recurrent hidden layers 
that pass the full sequence to the next layer, while the last recurrent layer returns only its final output. 
The input size is the maximum sequence length, with an embedding size of 250. We added dropout layers with 
the probability set to 0.4 to avoid overfitting [39]. Dropout randomly deactivates units during training so 
that their outputs are not passed to the following layer. We used the Adam algorithm as the optimizer with a 
binary_crossentropy loss function. In addition, we used ReLU in the hidden layers and softmax in the final 
layer. Figure 2 shows the layers of the deep learning model. 


3.4. Model evaluation 

Model evaluation measures the performance of a machine learning classification model. To evaluate 
model performance, we used accuracy, precision, recall, and F1-measure (F1-score) [40]. Furthermore, we 
compared all models to identify the one that performed best on the experimental dataset. 
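
The four measures can be computed from the true and predicted classes as sketched below. The averaging method is not stated in the text; support-weighted averaging is assumed here because it makes the averaged recall equal to the accuracy, which matches the identical accuracy and recall rows reported in Table 4. The function name and the toy labels are hypothetical.

```python
def weighted_report(y_true, y_pred, labels=("buy", "hold", "sell")):
    """Accuracy plus per-class and support-weighted precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    accuracy = sum(t == p for t, p in pairs) / n
    per_class = {}
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class[c] = {"precision": precision, "recall": recall, "f1": f1, "support": tp + fn}
    # Weight each class by its number of true samples (its support).
    weighted = {
        m: sum(per_class[c][m] * per_class[c]["support"] for c in labels) / n
        for m in ("precision", "recall", "f1")
    }
    return accuracy, per_class, weighted


# Toy labels for six trading days.
y_true = ["buy", "buy", "sell", "hold", "sell", "buy"]
y_pred = ["buy", "sell", "sell", "hold", "sell", "buy"]
accuracy, per_class, weighted = weighted_report(y_true, y_pred)
print(round(accuracy, 3), round(weighted["f1"], 3))
```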


3.5. Investment return analysis 

In addition, we used the resulting model to analyze investment returns as an initial investment 
approach. We focused on the stock trading signals from each company's forecast dataset for yearly and 
quarterly investments. We set the investment rate to 10 percent of the initial stock price. We then 
defined entry points for stock trading from the model to calculate the investment returns, as 
shown in Figure 3. 
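
As a rough illustration of how returns can be derived from a sequence of trading signals, the sketch below buys a fixed number of units at a buy signal when no position is open and sells them at the next sell signal. The trading rule, the position size, the function name, and the prices are all assumptions made for illustration; this is not the procedure shown in Figure 3.

```python
def realized_profit(closes, signals, units=10):
    """Sum the profit of each completed buy-then-sell round trip."""
    profit = 0.0
    entry = None  # entry price of the open position, if any
    for price, signal in zip(closes, signals):
        if signal == "buy" and entry is None:
            entry = price                      # open a position
        elif signal == "sell" and entry is not None:
            profit += (price - entry) * units  # close it and book the result
            entry = None
    return profit


# Hypothetical closing prices and daily signals.
closes = [100, 105, 110, 108, 115]
signals = ["buy", "hold", "sell", "buy", "sell"]
print(realized_profit(closes, signals))
```

A position that is still open at the end of the period is ignored here, which is one of several simplifications.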

[Figure 2 diagram. Recoverable layers and output shapes: input_news (InputLayer): (None, 2130); input_indi (InputLayer): (None, 6); SpatialDropout1D: (None, 2136, 250); Bidirectional(GRU): (None, 2136, 256); Bidirectional(GRU): (None, 2136, 128); Bidirectional(GRU): (None, 2136, 64); Bidirectional(GRU): (None, 32); Dense output layer: (None, 3)] 

Figure 2. Deep learning model network architecture 

Figure 3. Defining entry points for stock trading from the model 


4. RESULTS AND DISCUSSION 

The experimental results compare the classification performance of the traditional and deep learning 
models in terms of accuracy, precision, recall, and F1-score. We found that the BiGRU model outperformed 
the other models in average accuracy, precision, recall, and F1-score, with scores of 0.93, 0.93, 0.93, and 
0.92, respectively. Table 4 summarizes the performance evaluation of the traditional and deep learning 
classification models. The results also show that the deep learning models achieved an average accuracy 
17.19% higher than the traditional models. 


Table 4. Comparison results as average scores of the classification performance 
Performance   Traditional classification model   Deep learning classification model 
              SVM    MLP                         BiGRU   BiLSTM   GRU    LSTM 
Accuracy      0.72   0.66                        0.93    0.83     0.81   0.67 
Precision     0.73   0.67                        0.93    0.84     0.81   0.65 
Recall        0.72   0.66                        0.93    0.83     0.81   0.67 
F1-score      0.73   0.63                        0.92    0.81     0.79   0.62 
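
The reported accuracy gap can be checked against the accuracy row of Table 4. The sketch below assumes the 17.19% figure is a relative difference between the two group means; with the rounded table values the result is roughly 17.4%, and the small gap to the reported figure is presumably due to rounding.

```python
# Accuracy row of Table 4.
deep = {"BiGRU": 0.93, "BiLSTM": 0.83, "GRU": 0.81, "LSTM": 0.67}
traditional = {"SVM": 0.72, "MLP": 0.66}

deep_mean = sum(deep.values()) / len(deep)
trad_mean = sum(traditional.values()) / len(traditional)

# Relative gain of the deep learning group over the traditional group.
relative_gain = (deep_mean - trad_mean) / trad_mean
print(f"{deep_mean:.2f} {trad_mean:.2f} {relative_gain:.2%}")
```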


Table 5 compares the average F1-score grouped by class of trading signal. The results show that the 
BiGRU, BiLSTM, and GRU models achieved a higher F1-score than both traditional models in every class, 
whereas the LSTM model did not. The BiGRU model obtained the best classification performance. Therefore, 
we selected it to define stock trading signals from the stock's daily closing price based on the model's 
forecast dataset. Figure 4 shows a sample trading signal from the model's forecast dataset. 


Table 5. Comparison of the average F1-score grouped by class of trading signal 
Class of trading signal   SVM    MLP    BiGRU   BiLSTM   GRU    LSTM 
Buy                       0.79   0.71   0.94    0.81     0.88   0.66 
Hold                      0.43   0.34   0.85    0.68     0.57   0.26 
Sell                      0.78   0.71   0.95    0.95     0.81   0.76 


After that, we applied the BiGRU model, which had the best classification performance in the 
experiment, to analyze the initial investment approach. We focused on the stock trading signals from each 
company's forecast dataset in 2021. We derived trading signal points from the model, as shown in Figure 5, 
to compare yearly investment returns (Figure 5(a)) and quarterly investment returns (Figure 5(b)). 

We then analyzed the investment returns by comparing each company's full-year and quarterly gains 
and losses. We assumed an investment of 10,000 baht per unit. The results showed that, for six of the 
eight companies (all except SCB and TRUE), the total investment returns from quarterly investment were 
higher than those from full-year investment, as shown in Table 6. 

Trading signals were identified from daily historical stock price trends. We combined several 
indicator-based technical analysis strategies with corporate news data to further increase the accuracy of 
daily stock price data analysis. This process increases the classification efficiency of the model. The 
experimental results showed better average classification accuracy than stock price analysis with natural 
language processing only [37] or deep learning classification of stock market price movements with 
technical indicators only [8]. The deep learning models had many parameters to process at each layer, and 
the number differed between models. The BiRNN model links hidden layers running in opposite directions to 
the same output, so it has the most parameters to train. In particular, the BiLSTM model had the highest 
number of trainable parameters, which increased the training time. In addition, the comparison of average 
classification accuracy shows that the traditional classifiers lost accuracy as the number of trainable 
parameters grew or the model became more complex. However, the BiGRU model achieved the best average 
accuracy among the deep learning classifiers. The model can be used to analyze investment returns from the 
derived trading signal points, which investors could use to make future decisions. 
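
The difference in trainable parameters between the bidirectional models can be estimated with the standard per-gate count: an LSTM has four gate blocks and a GRU has three, and a bidirectional layer holds a forward and a backward copy. The sketch below uses the unit sizes (128, 64, 32, 16) and the embedding size of 250 from the text and assumes one bias vector per gate; some implementations add a second bias vector to the GRU, so exact counts can differ slightly. The embedding and dense layers are left out, and the function names are hypothetical.

```python
def recurrent_params(input_dim, units, gates):
    """Per gate: an input kernel, a recurrent kernel, and one bias vector."""
    return gates * (units * (input_dim + units) + units)


def stacked_bidirectional_params(input_dim, layer_units, gates):
    """Total parameters of stacked bidirectional recurrent layers."""
    total = 0
    for units in layer_units:
        total += 2 * recurrent_params(input_dim, units, gates)  # forward + backward
        input_dim = 2 * units  # concatenated directions feed the next layer
    return total


UNITS = (128, 64, 32, 16)  # dimension units from Table 3
EMBED = 250                # embedding size from the text

bilstm = stacked_bidirectional_params(EMBED, UNITS, gates=4)
bigru = stacked_bidirectional_params(EMBED, UNITS, gates=3)
print(bilstm, bigru)
```

Under these assumptions the BiGRU stack needs three quarters of the recurrent parameters of the BiLSTM stack, which is consistent with the BiLSTM model having the most trainable parameters.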
Figure 4. A stock trading signal from the forecast dataset 

Figure 5. Trading signal points to analyze investment returns (a) yearly investment returns and 
(b) quarterly investment returns 


Table 6. Average total investment returns 
Investment period   Final earn/loss (Baht) 
                    SCB       LPN     PTTEP     IVL      BEC      TRUE     CPF      TSR 
Year                377,500   200     135,000   45,000   51,500   12,600   0        1,600 
Quarter             362,500   2,700   232,500   95,000   57,000   11,400   10,000   4,800 


5. CONCLUSION 

In summary, we proposed a framework that combines machine learning-based natural language 
processing with technical analysis for stock trading. The framework uses corporate stock news, which is 
most effective when analyzed in combination with stock price indicator strategies. We selected the closing 
prices and corporate news of eight stocks from the Thai Industry Group Index and Sector Index from 
01-01-2019 to 31-12-2021 to classify stock trading signals, estimate the classification efficiency of the 
models, and analyze the investment returns. We compared two traditional models (SVM and MLP) with 
four deep learning models (LSTM, GRU, BiLSTM, and BiGRU) and evaluated their classification performance 
with accuracy, precision, recall, and F1-score. In addition, we added new words and selected appropriate 
ones for the Thai corpus to optimize the Thai word tokenization process. Trading signals were identified 
from daily historical stock price trends. Using several indicator-based technical analysis strategies 
further increases the accuracy of daily stock price data analysis. This process increases the classification 
efficiency of the model and can also be used to analyze the investment returns. 

Experimentally, the results show that the average classification accuracy of the deep learning 
models was 17.19% higher than that of the traditional models, with the highest average accuracy (0.93) 
obtained by the BiGRU model. Moreover, the experimental outcomes confirmed that combining relevant 
corporate news text with co-occurring stock indicators for stock trading classification provides more 
accuracy than a model based on corporate news or stock indicators only. Consequently, the BiGRU model 
was appropriate for our experiments and was used to derive trading signal points for comparing yearly and 
quarterly investment returns. The results show that the average total investment returns from quarterly 
investment were higher than those from full-year investment. 